
    CFD Solvers with Minimal Memory Access

    Many state-of-the-art CFD codes exhibit low computational intensity (flops per RAM access) and therefore "saturate" the memory bandwidth of modern chips after only a few cores, which limits the benefit of using more of the available cores. This bottleneck is expected to become even more pronounced on future manycore systems, and it has led to the quest for CFD solvers with minimal memory access. We report on recent developments and results for Finite Difference and Edge-Based Finite Element solvers. The best of these implementations yield one residual for only 6 fetches and 4 stores, regardless of the size of the stencil (and therefore of the discretization order). In terms of memory access they are thus competitive even with finite difference stencils as low as 2 (typical of CFD codes with 2nd order spatial discretization of fluxes and 4th order damping). Timings for a low Mach number finite difference code using a 6th order spatial discretization are competitive with those of conventional loops. This bodes well for future HPC architectures. Published in: Mecánica Computacional vol. XXXV, no. 1. Facultad de Ingenierí

    Prediction of wear via DEM and phenomenological models

    The Discrete Element Method (DEM) is a computational method used to describe the movement of a large number of particles of different sizes and shapes, which interact through a contact model. Among other applications, in the field of mining DEM has been used extensively to predict the trajectory of material inside Semi-Autogenous Grinding (SAG) mills and in mineral transfer chutes. However, no calculations that predict the wear of the enclosing walls have been performed to date. After an extensive review of the literature, a methodology to predict wear via DEM and phenomenological wear models has been developed. The decision was taken to use Archard's model (one of the simplest yet most accurate models proposed to date) in the context of DEM. Given that the wear occurs over weeks or months, and that a DEM run of even a minute can consume copious amounts of computer resources, a separation of timescales was implemented. For each stage of the overall cycle, the present configuration is run for a relatively small amount of physical time (from T0 to T1) in order to gather the wear statistics. For a mill, this could be a few rotations. For all the faces on the boundaries, the wear is updated every time step. At the end of the DEM run, the total change in volume is used to compute a 'recession speed' for each face. The recession speed is then used to extrapolate the recession distance (i.e. the wear) from T0 to a much larger time T2. Once the surface has been moved by the recession distance, the run is restarted and the cycle repeats. The results obtained to date show that the methodology is able to compute realistic wear patterns with CPU requirements that are acceptable in an engineering design environment. Published in: Mecánica Computacional vol. XXXV, no. 7. Facultad de Ingenierí
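
    The timescale separation described in this abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the `Face` class, the function names, the numerical values, and the conversion of worn volume to a recession speed by dividing by the face area are all assumptions. Only Archard's relation (worn volume = k × normal force × sliding distance / hardness) and the extrapolation from T0 to T2 come from the text.

```python
from dataclasses import dataclass


@dataclass
class Face:
    """Hypothetical boundary face; not taken from the paper."""
    area: float                # face area [m^2]
    wear_volume: float = 0.0   # worn volume accumulated during the DEM run [m^3]


def archard_increment(k, normal_force, sliding_speed, hardness, dt):
    """Archard's model for one time step: V = k * F_n * s / H,
    with sliding distance s = sliding_speed * dt."""
    return k * normal_force * sliding_speed * dt / hardness


def recession_distance(face, t0, t1, t2):
    """Extrapolate the wear gathered over [t0, t1] to the later time t2.

    Assumption: recession speed = worn volume / (face area * run duration).
    """
    recession_speed = face.wear_volume / (face.area * (t1 - t0))  # [m/s]
    return recession_speed * (t2 - t0)                            # [m]


def demo():
    """Short DEM window (1 s) extrapolated to one week, with made-up inputs."""
    face = Face(area=0.01)
    dt, steps = 1.0e-3, 1000
    for _ in range(steps):
        # The wear on the face is updated every time step.
        face.wear_volume += archard_increment(
            k=1.0e-4, normal_force=100.0, sliding_speed=2.0,
            hardness=1.0e9, dt=dt)
    one_week = 7 * 24 * 3600.0
    return recession_distance(face, t0=0.0, t1=steps * dt, t2=one_week)
```

    With these made-up inputs, `demo()` returns a recession of about 1.2 mm after one week. In the full cycle the surface would then be moved by that distance and the DEM run restarted.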

    Improvements in speed and scalability of a DEM code

    A number of near-optimal techniques were implemented to reduce computing times for the Discrete Element Method (DEM) code DESOL. Among these, the following yielded the largest improvements: multilevel bins, periodic rebuild, trimming and Symmetric Multiprocessor (SMP) parallelization. These improvements have led to reductions in Central Processing Unit (CPU) time of the order of 1:3-1:5 on scalar machines, while also showing excellent scalability up to the point of memory saturation, which on current Intel Xeon processors occurs at approximately 8 cores for double precision and 16 cores for single precision. Published in: Mecánica Computacional vol. XXXV, no. 10. Facultad de Ingenierí

    High-order interpolation between adjacent Cartesian finite difference grids of different size

    Nested Cartesian grid systems by design require interpolation of solution fields from coarser to finer grid systems. While several choices are available, preserving accuracy, stability and efficiency at the same time requires careful design of the interpolation schemes. Given this context, a series of interpolation algorithms for nested Cartesian finite difference grids of different size were developed and tested. These algorithms are based on post-processing, on each local grid, the raw (bi/trilinear) information passed to the halo points from coarser grids. In this way modularity is maximized while preserving locality. The results obtained indicate that the schemes markedly improve the convergence rates and the overall accuracy of finite difference codes with varying grid sizes. Published in: Mecánica Computacional vol. XXXV, no. 15. Facultad de Ingenierí

    Towards Real-Time Monitoring of the Hajj

    An automated approach to explore the fundamental properties of high-density pedestrian traffic is outlined. The framework operates on video or time-lapse images captured from surveillance cameras. For pedestrian velocity extraction, the framework incorporates cross-correlation based Particle Image Velocimetry (PIV) techniques. For pedestrian density estimation, the framework relies on the machine learning technique of Boosted Regression Trees. The information collected from images in pixel coordinates is transformed to world coordinates with a pin-hole camera based projective transformation technique. The framework has been tested with high-density crowd images acquired during the Muslim religious event, the Hajj. Accuracy and performance of the framework are reported.
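
    The pixel-to-world step mentioned in this abstract can be illustrated with a planar projective transformation. This is a minimal sketch under stated assumptions, not the authors' method: it assumes pedestrians move on a flat ground plane, so that the pin-hole projection reduces to a 3x3 homography, and the matrix values and the function name `pixel_to_world` are hypothetical.

```python
def pixel_to_world(h, u, v):
    """Map pixel coordinates (u, v) to ground-plane coordinates (x, y).

    h is a 3x3 homography given as nested lists; in practice it would come
    from a camera calibration, here it is simply assumed to be known.
    """
    xh = h[0][0] * u + h[0][1] * v + h[0][2]
    yh = h[1][0] * u + h[1][1] * v + h[1][2]
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    if w == 0.0:
        raise ValueError("pixel maps to a point at infinity")
    # Divide by the homogeneous coordinate to undo the perspective scaling.
    return xh / w, yh / w


# Hypothetical homography: 0.02 m per pixel with a mild perspective term in v.
H_EXAMPLE = [
    [0.02, 0.0, 0.0],
    [0.0, 0.02, 0.0],
    [0.0, 0.001, 1.0],
]

x_world, y_world = pixel_to_world(H_EXAMPLE, 100.0, 200.0)
```

    Velocities extracted in pixels per frame can be converted in the same way, by mapping both end points of each displacement vector and dividing the world-space difference by the frame interval.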